Page segmentation and text extraction from gray-scale images in microfilm format
Authors
Abstract
This paper describes a system for separating textual regions from graphics regions and locating text against textured backgrounds. We present an edge-detection-based method that automatically locates text in noisy grayscale newspaper images in microfilm format. The algorithm first finds the edges of textual regions with the Canny edge detector, then merges edges and uses edge features for block segmentation and classification. Feature-aided connected component analysis then groups homogeneous textual regions within their bounding boxes. Introducing the TLC yields an efficient block segmentation with reduced memory use. The proposed method has been used to locate text in a group of newspaper images with multiple page layouts. Initial results are encouraging. We plan to expand the experimental data to over 300 microfilm images with different layout structures, and we anticipate promising results once the prototype algorithm is modified to make it more robust and suitable for different cases.
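The pipeline outlined in the abstract (edge detection, edge merging, connected component grouping into bounding boxes) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation. The function names `edge_map`, `merge_edges`, and `component_boxes` are hypothetical, a plain gradient threshold stands in for the Canny detector, horizontal run-length smearing stands in for the paper's edge merging, and the block classification step and the TLC are omitted because the abstract gives no details on them.

```python
from collections import deque


def edge_map(img, thresh):
    """Mark pixels whose intensity jump to the right or lower neighbour
    exceeds `thresh`. Assumption: this simple gradient threshold is only a
    stand-in for the Canny detector named in the abstract."""
    h, w = len(img), len(img[0])
    edges = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            gx = abs(img[y][min(x + 1, w - 1)] - img[y][x])
            gy = abs(img[min(y + 1, h - 1)][x] - img[y][x])
            if max(gx, gy) > thresh:
                edges[y][x] = 1
    return edges


def merge_edges(edges, gap):
    """Fill background runs of at most `gap` pixels between two edge pixels
    on the same row (horizontal run-length smearing, an assumed stand-in
    for the paper's edge merging)."""
    merged = [row[:] for row in edges]
    for y, row in enumerate(edges):
        last = None  # column of the previous edge pixel on this row
        for x, value in enumerate(row):
            if value:
                if last is not None and 0 < x - last - 1 <= gap:
                    for k in range(last + 1, x):
                        merged[y][k] = 1
                last = x
    return merged


def component_boxes(mask):
    """Return (top, left, bottom, right) for each 8-connected component of
    nonzero pixels, in row-major order of each component's first pixel."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y0 in range(h):
        for x0 in range(w):
            if not mask[y0][x0] or seen[y0][x0]:
                continue
            seen[y0][x0] = True
            queue = deque([(y0, x0)])
            top, left, bottom, right = y0, x0, y0, x0
            while queue:  # breadth-first flood fill of one component
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and mask[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


def locate_blocks(img, thresh=50, gap=2):
    """Chain the three steps; the default values are illustrative only."""
    return component_boxes(merge_edges(edge_map(img, thresh), gap))
```

Here `img` is a list of rows of gray levels. In this sketch, the candidate text blocks are the returned boxes, which a real system would still have to classify as text or graphics.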
Similar resources
Extraction and 3D Segmentation of Tumors-Based Unsupervised Clustering Techniques in Medical Images
Introduction: The diagnosis and separation of cancerous tumors in medical images require accuracy, experience, and time, and have always been a major challenge for radiologists and physicians. Materials and Methods: We received 290 medical images, comprising 120 mammographic images in LJPEG format, scanned in gray-scale at 50 microns, and 110 MRI images including T1-weighted, T...
Document Analysis And Classification Based On Passing Window
In this paper we present a document analysis and classification system to segment and classify the contents of Arabic document images. This system includes preprocessing, document segmentation, feature extraction and document classification. In preprocessing, a document image is enhanced by removing noise, binarizing, and detecting and correcting image skew. In document segmentation, an algorith...
Using Irregular Pyramid for Text Segmentation and Binarization of Gray Scale Images
Compared to the binary images that most text extraction methods work on, gray scale images provide much more information for the extraction task. On the other hand, complications also arise in separating the textual content from its background region (i.e., thresholding) before the actual text extraction process can begin. Differing from the usual sequence of processes where document images a...
Evaluation of gray scale changes of CBCT system images in different axis using the DICOM file
The images of dental CBCT imaging systems, which use cone-shaped beams and are stored in the DICOM format, have various applications in dentistry, including bone density estimation to select the location of the orthodontic implant, bone loss detection, etc. In these systems, unlike CT imaging systems, the resulting images exhibit gray-scale non-uniformity along each of the different axes in the FOV. Thi...
Using Irregular Pyramid for Text Segmentation and Binarization of Gray Scale Image
Compared to the binary images that most text extraction methods work on, gray scale images provide much more information for the extraction task. On the other hand, complications also arise in separating the textual content from its background region (i.e., thresholding) before the actual text extraction process can begin. Differing from the usual sequence of processes where document images ...